84 research outputs found

2010 ISCB Overton Prize Awarded to Steven E. Brenner

Author: AG Murzin, BJ Morrison McKay, BP Lewis, Clare Sansom, JE Stajich, JM Chandonia, LF Lareau, SE Brenner
Publication venue: Public Library of Science
Publication date: 01/06/2010

Crossref

Directory of Open Access Journals

PubMed Central

Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

Author: A Bateman, A Roberts, CA Hutchison, D Baker, Debanu Das, DH Shin, Dong Hae Shin, IG Choi, In-Geol Choi, J Hou, J Liu, Jingtong Hou, JM Chandonia, John-Marc Chandonia, JS Kim, K Liolios, KK Kim, KY Hwang, L Holm, L Huang, R Service, RD Finn, RL Tatusov, Rosalind Kim, S Chen, SH Kim, Sung-Hou Kim, T Numata, TI Zarembinski, U Schulze-Gahmen, V Oganesyan
Publication venue: Springer Science and Business Media LLC

Crossref

Towards a comprehensive structural coverage of completed genomes: a structural genomics viewpoint

Author: A Andreeva, A Bateman, A Elofsson, A Lupas, A McPherson, A Sali, AE Todd, AJ Enright, B Rost, C Sander, C Vogel, CA Orengo, CH Wu, Christine A Orengo, D Baker, D Busso, D Vitkup, DT Jones, FMG Pearl, GA Reeves, I Letunic, IV Grigoriev, J Liu, J Park, J Thornton, J Westbrook, JA Ranea, JC Norvell, JC Wootton, JD Watson, JM Chandonia, K Karplus, KT Simons, M Linial, M Skovgaard, N Siew, PJ Kersey, R Sanchez, RA Laskowski, RC Stevens, RI Sadreyev, RL Marsden, Russell L Marsden, SA Lesley, SE Brenner, SH Kim, SK Burley, SR Eddy, TC Terwilliger, Tony A Lewis, W Minor, W Tian, Y Kim, Y Yan
Publication venue: BioMed Central
Publication date: 01/03/2007

BACKGROUND: Structural genomics initiatives were established with the aim of solving protein structures on a large scale. For many initiatives, such as the Protein Structure Initiative (PSI), target selection is focussed primarily on structurally characterising protein families which, so far, lack a structural representative. It is therefore of considerable interest to gain insights into the number and distribution of these families, and what efforts may be required to achieve a comprehensive structural coverage across all protein families. RESULTS: In this analysis we have derived a comprehensive domain annotation of the genomes using CATH, Pfam-A and Newfam domain families. We consider what proportions of structurally uncharacterised families are accessible to high-throughput structural genomics pipelines, specifically those targeting families containing multiple prokaryotic orthologues. In measuring the domain coverage of the genomes, we show the benefits of selecting targets from structurally uncharacterised domain families whilst also pursuing additional targets from large structurally characterised protein superfamilies. CONCLUSION: This work suggests that such a combined approach to target selection is essential if structural genomics is to achieve a comprehensive structural coverage of the genomes, leading to greater insights into structure and the mechanisms that underlie protein evolution.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

FLORA: a novel method to predict protein function from structure in diverse superfamilies

Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures purely by using patterns of structural conservation of all residues.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Exhaustive assignment of compositional bias reveals universally prevalent biased regions: analysis of functional associations in human and Drosophila

Author: AL Fink, C Bracken, E Wolf, GO Consortium, HJ Dyson, I Kuznetsov, JC Wootton, JJ Ward, JM Chandonia, MJ Wise, MM Alba, NG Faux, Paul M Harrison, PM Harrison, R Linding, RC Edgar, S Karlin, S Vucetic, SF Altschul, VJ Promponas
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Compositionally biased (CB) regions are stretches in protein sequences made mainly from a distinct subset of amino acid residues; such regions are frequently associated with a structural role in the cell, or with protein disorder. RESULTS: We derived a procedure for the exhaustive assignment and classification of CB regions, and have applied it to thirteen metazoan proteomes. Sequences are initially scanned for the lowest-probability subsequences (LPSs) for single amino-acid types; subsequently, an exhaustive search for LPSs for multiple residue types is performed iteratively until convergence, to define CB region boundaries. We analysed > 40,000 CB regions with > 20 million residues; strikingly, nine single-/double-residue biases are universally abundant, and are consistently highly ranked across both vertebrates and invertebrates. To home in on subpopulations of CB regions of interest in human and D. melanogaster, we analysed CB region lengths, conservation, inferred functional categories and predicted protein disorder, and filtered for coiled coils and protein structures. In particular, we found that some of the universally abundant CB regions have significant associations with transcription and nuclear localization in human and Drosophila, and are also predicted to be moderately or highly disordered. Focussing on Q-based biased regions, we found that these regions are typically only well conserved within mammals (appearing in 60–80% of orthologs), with shorter human transcription-related CB regions being unconserved outside of mammals; they are also preferentially linked to protein domains such as the homeodomain and glucocorticoid-receptor DNA-binding domain. In general, only ~40–50% of residues in these human and Drosophila CB regions have predicted protein disorder. CONCLUSION: This data is of use for the further functional characterization of genes, and for structural genomics initiatives.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship@McGill

A framework for protein structure classification and identification of novel protein structures

Author: AC Martin, AC Murzin, AJ Enright, AP Singh, C Cortes, CA Orengo, D Chivian, D Frishman, G Getz, HK Saini, IN Shindyalov, J Gough, J Hou, JE Gewehr, Jignesh M Patel, JM Chandonia, L Holm, L Lo Conte, M Madera, N Beckmann, O Çamoglu, P Røgen, R Day, S Cheek, S Van Dongen, T Madej, You Jung Kim
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. With the rapid increase in the number of new protein structures, the need for automated and accurate methods for protein classification is increasingly important. RESULTS: In this paper we present a unified framework for protein structure classification and identification of novel protein structures. The framework consists of a set of components for comparing, classifying, and clustering protein structures. These components allow us to accurately classify proteins into known folds, to detect new protein folds, and to provide a way of clustering the new folds. In our evaluation with SCOP 1.69, our method correctly classifies 86.0%, 87.7%, and 90.5% of new domains at family, superfamily, and fold levels. Furthermore, for protein domains that belong to new domain families, our method is able to produce clusters that closely correspond to the new families in SCOP 1.69. As a result, our method can also be used to suggest new classification groups that contain novel folds. CONCLUSION: We have developed a method called proCC for automatically classifying and clustering domains. The method is effective in classifying new domains and suggesting new domain families, and it is also very efficient. A web site offering access to proCC is freely available a

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Combining sequence-based prediction methods and circular dichroism and infrared spectroscopic data to improve protein secondary structure determinations

Author: A Bairoch, AJ Miles, B Rost, BA Wallace, BW Mathews, DT Jones, FJ Zhu, G Wang, IYY Koh, JG Lees, JM Chandonia, Jonathan G Lees, K Bryson, KA Oberg, KC Chou, L Whitmore, N Sreerama, Robert W Janes, RW Janes, S DeJong, SF Altschul, SM Kelly, TZ Sen, W Kabsch
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: A number of sequence-based methods exist for protein secondary structure prediction. Protein secondary structures can also be determined experimentally from circular dichroism and infrared spectroscopic data using empirical analysis methods. It has been proposed that comparable accuracy can be obtained from sequence-based predictions as from these biophysical measurements. Here we have compared the secondary structure determination accuracies of sequence prediction methods with the empirically determined values from the spectroscopic data on datasets of proteins for which both crystal structures and spectroscopic data are available. RESULTS: In this study we show that the sequence prediction methods have accuracies nearly comparable to those of spectroscopic methods. However, we also demonstrate that combining the spectroscopic and sequence techniques produces significant overall improvements in secondary structure determinations. In addition, combining the extra information content available from synchrotron radiation circular dichroism data with sequence methods also shows improvements. CONCLUSION: Combining sequence prediction with experimentally determined spectroscopic methods for protein secondary structure content significantly enhances the accuracy of the overall results obtained.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Automated functional classification of experimental and predicted protein structures

Author: A Andreeva, A Godzik, A Stark, AG Murzin, AR Ortiz, B Zhang, D Fischer, D Pal, D Xu, EC Webb, EF Pettersen, F Pazos, GJ Bartlett, H Hegyi, HM Berman, IN Shindyalov, J Gough, JA Di Gennaro, JC Whisstock, JD Thompson, JD Watson, JM Bujnicki, JM Chandonia, JS Fetrow, JV Ponomarenko, K Ginalski, K Pawlowski, K Wang, Kai Wang, L Liao, L Rychlewski, L Xie, LH Hung, M Ashburner, MJ Ondrechen, N Nagano, R Kuang, Ram Samudrala, S Cheek, SE Brenner, SF Altschul, SK Burley, SR Eddy, WR Pearson
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Proteins that are similar in sequence or structure may perform different functions in nature. In such cases, function cannot be inferred from sequence or structural similarity. RESULTS: We analyzed experimental structures belonging to the Structural Classification of Proteins (SCOP) database and showed that about half of them belong to multi-functional fold families for which protein similarity alone is not adequate to assign function. We also analyzed predicted structures from the LiveBench and the PDB-CAFASP experiments and showed that accurate homology-based functional assignments cannot be achieved approximately one third of the time, when the protein is a member of a multi-functional fold family. We then conducted extended performance evaluation and comparisons on both experimental and predicted structures using our Functional Signatures from Structural Alignments (FSSA) algorithm that we previously developed to handle the problem of classifying proteins belonging to multi-functional fold families. CONCLUSION: The results indicate that the FSSA algorithm has better accuracy when compared to homology-based approaches for functional classification of both experimental and predicted protein structures, in part due to its use of local, as opposed to global, information for classifying function. The FSSA algorithm has also been implemented as a webserver and is available at

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A Database of Domain Definitions for Proteins with Complex Interdomain Geometry

Author: AG Murzin, AS Siddiqui, C Chothia, CA Orengo, DB Wetlaufer, DC Phillips, GD Rose, H Hegyi, HM Berman, I Majumdar, Indraneel Majumdar, J Janin, JM Chandonia, K Kamada, L Holm, LH Greene, Lisa N. Kinch, LN Kinch, M Baron, M Levitt, MA Augustin, Mark Isalan, MB Swindells, MH Zehfus, N Alexandrov, Nick V. Grishin, RB Russell, S Veretnik, SF Altschul, WL DeLano
Publication venue: Public Library of Science
Publication date: 01/01/2009

Protein structural domains are necessary for understanding evolution and protein folding, and may vary widely from functional and sequence-based domains. Although various structural domain databases exist, defining domains for some proteins is non-trivial, and definitions of their domain boundaries are not available. Here, we present a novel database of manually defined structural domains for a representative set of proteins from the SCOP “multi-domain proteins” class (http://prodata.swmed.edu/multidom/). We consider our domains as mobile evolutionary units, which may rearrange during protein evolution. Additionally, they may be visualized as structurally compact and possibly independently folding units. We also found that representing domains as evolutionary and folding units does not always lead to a unique domain definition. However, unlike existing databases, we retain and refine these “alternate” domain definitions after careful inspection of structural similarity, functional sites and automated domain definition methods. We provide domain definitions, including actual residue boundaries, for proteins that well-known databases like SCOP and CATH do not attempt to split. Our alternate domain definitions are suitable for sequence and structure searches by automated methods. Additionally, the database can be used for training and testing domain delineation algorithms. Since our domains represent structurally compact evolutionary units, the database may be useful for studying domain properties and evolution.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central